A BoW-equivalent Recurrent Neural Network for Action Recognition

نویسندگان

  • Alexander Richard
  • Juergen Gall
چکیده

Bag-of-words (BoW) models are widely used in the field of computer vision. A BoW model consists of a visual vocabulary that is generated by unsupervised clustering the features of the training data, e.g., by using kMeans. The clustering methods, however, struggle with large amounts of data, in particular, in the context of action recognition. In this paper, we propose a transformation of the standard BoW model into a neural network, enabling discriminative training of the visual vocabulary on large action recognition datasets. We show that our model is equivalent to the original BoW model but allows for the application of supervised neural network training. Our model outperforms the conventional BoW model and sparse coding methods on recent action recognition benchmarks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bag-of-words input for long history representation in neural network-based language models for speech recognition

In most of previous works on neural network based language models (NNLMs), the words are represented as 1-of-N encoded feature vectors. In this paper we investigate an alternative encoding of the word history, known as bag-of-words (BOW) representation of a word sequence, and use it as an additional input feature to the NNLM. Both the feedforward neural network (FFNN) and the long short-term me...

متن کامل

Designing Path for Robot Arm Extensions Series with the Aim of Avoiding Obstruction with Recurring Neural Network

In this paper, recurrent neural network is used for path planning in the joint space of the robot with obstacle in the workspace of the robot. To design the neural network, first a performance index has been defined as sum of square of error tracking of final executor. Then, obstacle avoidance scheme is presented based on its space coordinate and its minimum distance between the obstacle and ea...

متن کامل

Recurrent Neural Network Language Model with Incremental Updated Context Information Generated Using Bag-of-Words Representation

Recurrent neural network language model (RNNLM) is becoming popular in the state-of-the-art speech recognition systems. However, it can not remember long term patterns well due to a so-called vanishing gradient problem. Recently, Bag-of-words (BOW) representation of a word sequence is frequently used as a context feature to improve the performance of a standard feedforward NNLM. However, the BO...

متن کامل

A bag-of-words equivalent recurrent neural network for action recognition

The traditional bag-of-words approach has found a wide range of applications in computer vision. The standard pipeline consists of a generation of a visual vocabulary, a quantization of the features into histograms of visual words, and a classification step for which usually a support vector machine in combination with a non-linear kernel is used. Given large amounts of data, however, the model...

متن کامل

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015